Visual Inference on English Women’s Football Data

12/9/2024

Charlotte Imbert

The Data

  • The data used in this analysis comes from the English Women’s Football Database, created by Rob Clapp

  • Contains data from every match played in the top division (since 2011) and the second division (since 2014) of professional English women’s football

The Data

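library(tidyverse)

# ewf_appearances is assumed to be already loaded from the database;
# it holds one row per team per match.
ewf_appearances <- ewf_appearances |>
  select(match_name, date, home_team, away_team, win, loss, draw)
head(ewf_appearances)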
# A tibble: 6 × 7
  match_name                    date       home_team away_team   win  loss  draw
  <chr>                         <date>         <dbl>     <dbl> <dbl> <dbl> <dbl>
1 Chelsea Ladies vs Arsenal La… 2011-04-13         1         0     0     1     0
2 Chelsea Ladies vs Arsenal La… 2011-04-13         0         1     1     0     0
3 Lincoln Ladies vs Doncaster … 2011-04-13         1         0     0     1     0
4 Lincoln Ladies vs Doncaster … 2011-04-13         0         1     1     0     0
5 Birmingham City Ladies vs Br… 2011-04-14         1         0     1     0     0
6 Birmingham City Ladies vs Br… 2011-04-14         0         1     0     1     0

EDA

Question

Is there a home advantage in English women’s football?

Permutation

Null hypothesis: in professional women’s football, a team’s home status for a game has no influence on the outcome of the game

Test statistic: proportion of wins at home

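set.seed(3)  # make the shuffles reproducible

# Shuffle the home/away label within each match and tag the result
# with the permutation number perm_n.
permutation <- function(dataset, perm_n) {
  dataset |>
    group_by(match_name) |>
    mutate(perm_home_team = sample(home_team, replace = FALSE)) |>
    ungroup() |>
    mutate(perm_n = perm_n)
}

# 20 permuted copies of the data, stacked into one data frame
shuffles <- map_dfr(1:20, ~ permutation(ewf_appearances, .x))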
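To make the link between the shuffles and the test statistic concrete, here is a minimal sketch on made-up data. The tibble `toy`, the helper `permute_home` (which mirrors `permutation`), and the name `lineup_stats` are all hypothetical and exist only for illustration. The sketch assumes the observed data enter the lineup as panel 0 with their true home labels.

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical toy data shaped like ewf_appearances:
# two rows per match, one per team (values are made up).
toy <- tibble(
  match_name = rep(c("A vs B", "C vs D", "E vs F", "G vs H"), each = 2),
  home_team  = rep(c(1, 0), times = 4),
  win        = c(1, 0, 0, 1, 1, 0, 0, 0)
)

# Same idea as permutation(): shuffle the home label within each match.
permute_home <- function(dataset, perm_n) {
  dataset |>
    group_by(match_name) |>
    mutate(perm_home_team = sample(home_team, replace = FALSE)) |>
    ungroup() |>
    mutate(perm_n = perm_n)
}

set.seed(3)
toy_shuffles <- map_dfr(1:20, ~ permute_home(toy, .x))

# Observed data as panel 0, keeping the true home label.
observed <- toy |>
  mutate(perm_home_team = home_team, perm_n = 0)

# Test statistic per panel: share of home appearances that ended in a win.
lineup_stats <- bind_rows(observed, toy_shuffles) |>
  filter(perm_home_team == 1) |>
  group_by(perm_n) |>
  summarise(home_win_prop = mean(win), .groups = "drop")
```

Each row of `lineup_stats` corresponds to one panel of a lineup, so the observed value (panel 0) can be compared directly with the 20 values generated under the null.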

Visual Lineup

Conclusions

  • No single plot stands out from the others as showing a much higher win proportion at home

  • Visually, the plots generated under the null look similar to the plot of the observed data

  • Plot 0 (observed data) has a higher proportion of wins at home than every other plot except plot 2, so a win proportion this high is plausible under the null

  • The visual lineup does not provide evidence of a significant home advantage in English professional women’s football
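The ranking described above can also be turned into a rough one-sided permutation p-value. This is only a sketch: it assumes plot 2 is the single null panel with a home win proportion at or above the observed one, and the variable names are illustrative.

```r
n_perm     <- 20  # null panels in the lineup
n_at_least <- 1   # null panels at or above the observed statistic (assumed: plot 2 only)

# Add 1 to numerator and denominator so the observed data count as one draw.
p_value <- (n_at_least + 1) / (n_perm + 1)
round(p_value, 3)
# 0.095
```

With only 20 shuffles this value is coarse, but it is consistent with the reading that the observed win proportion is not unusual under the null.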